贝叶斯优化(BO)方法试图找到目标功能的全球最佳功能,这些功能仅作为黑盒或昂贵的评估。这样的方法为目标函数构建了替代模型,从而量化了通过贝叶斯推论的替代物中的不确定性。客观评估是通过在每个步骤中最大化采集函数来依次确定的。但是,由于采集函数的非转换性,尤其是在批处理贝叶斯优化的情况下,该辅助优化问题可能是高度不平凡的,因此可以解决。在这项工作中,我们将批处理重新定义为在概率措施空间上的优化问题。我们基于多点预期改进来构建一个新的采集函数,该功能是概率度量空间的凸面。解决此“内部”优化问题的实用方案自然会作为该目标函数的梯度流。我们证明了这种新方法对不同基准函数的功效,并与最先进的批次BO方法进行了比较。
translated by 谷歌翻译
本文提出了一种新的加速马尔可夫链蒙特卡洛(MCMC)方法,以在成像逆问题中有效地执行贝叶斯计算。所提出的方法源自兰格文扩散过程,并源于紧密整合两个最先进的近端Langevin MCMC采样器,SK-ROCK和SPLIST GIBBS采样(SGS),它们采用明显不同的策略来提高收敛速度。更确切地说,我们在Langevin扩散过程的水平上展示了如何集成基于随机的Runge-Kutta-chebyshev扩散的近端SK-ROCK采样器,该采样器具有模型增强和放松策略,可用于扩散以牺牲渐近偏差为代价加快贝叶斯计算的速度。这导致了一种新的,更快的近端SK-ROCK采样器,将原始SK-Rock采样器的加速质量与增强和放松的计算益处相结合。此外,我们建议将增强和放松的模型视为目标模型的近似值,而是将放松定位在偏见 - 差异权衡中,而是建议将增强和放松的模型视为目标模型的概括。然后,这使我们能够仔细校准放松量,以同时提高模型的准确性(通过模型证据衡量)和采样器的收敛速度。为了实现这一目标,我们得出了一种经验性的贝叶斯方法,可以通过最大的边际似然估计自动估计最佳的松弛量。通过与图像脱毛和内化相关的一系列数值实验,以及与艺术状态的替代方法进行比较,证明了所提出的方法。
translated by 谷歌翻译
Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block intending to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and outperforms a DARTS network with similar size significantly, at layer counts as small as 2. Furthermore, with less layers, not only does it achieve higher accuracy with lower GMACs and parameter count, GradCAM comparisons show that our network is able to better detect distinctive features of target objects compared to DARTS.
translated by 谷歌翻译
This paper deals with the problem of statistical and system heterogeneity in a cross-silo Federated Learning (FL) framework where there exist a limited number of Consumer Internet of Things (CIoT) devices in a smart building. We propose a novel Graph Signal Processing (GSP)-inspired aggregation rule based on graph filtering dubbed ``G-Fedfilt''. The proposed aggregator enables a structured flow of information based on the graph's topology. This behavior allows capturing the interconnection of CIoT devices and training domain-specific models. The embedded graph filter is equipped with a tunable parameter which enables a continuous trade-off between domain-agnostic and domain-specific FL. In the case of domain-agnostic, it forces G-Fedfilt to act similar to the conventional Federated Averaging (FedAvg) aggregation rule. The proposed G-Fedfilt also enables an intrinsic smooth clustering based on the graph connectivity without explicitly specified which further boosts the personalization of the models in the framework. In addition, the proposed scheme enjoys a communication-efficient time-scheduling to alleviate the system heterogeneity. This is accomplished by adaptively adjusting the amount of training data samples and sparsity of the models' gradients to reduce communication desynchronization and latency. Simulation results show that the proposed G-Fedfilt achieves up to $3.99\% $ better classification accuracy than the conventional FedAvg when concerning model personalization on the statistically heterogeneous local datasets, while it is capable of yielding up to $2.41\%$ higher accuracy than FedAvg in the case of testing the generalization of the models.
translated by 谷歌翻译
Mapping the seafloor with underwater imaging cameras is of significant importance for various applications including marine engineering, geology, geomorphology, archaeology and biology. For shallow waters, among the underwater imaging challenges, caustics i.e., the complex physical phenomena resulting from the projection of light rays being refracted by the wavy surface, is likely the most crucial one. Caustics is the main factor during underwater imaging campaigns that massively degrade image quality and affect severely any 2D mosaicking or 3D reconstruction of the seabed. In this work, we propose a novel method for correcting the radiometric effects of caustics on shallow underwater imagery. Contrary to the state-of-the-art, the developed method can handle seabed and riverbed of any anaglyph, correcting the images using real pixel information, thus, improving image matching and 3D reconstruction processes. In particular, the developed method employs deep learning architectures in order to classify image pixels to "non-caustics" and "caustics". Then, exploits the 3D geometry of the scene to achieve a pixel-wise correction, by transferring appropriate color values between the overlapping underwater images. Moreover, to fill the current gap, we have collected, annotated and structured a real-world caustic dataset, namely R-CAUSTIC, which is openly available. Overall, based on the experimental results and validation the developed methodology is quite promising in both detecting caustics and reconstructing their intensity.
translated by 谷歌翻译
360-degree panoramic videos have gained considerable attention in recent years due to the rapid development of head-mounted displays (HMDs) and panoramic cameras. One major problem in streaming panoramic videos is that panoramic videos are much larger in size compared to traditional ones. Moreover, the user devices are often in a wireless environment, with limited battery, computation power, and bandwidth. To reduce resource consumption, researchers have proposed ways to predict the users' viewports so that only part of the entire video needs to be transmitted from the server. However, the robustness of such prediction approaches has been overlooked in the literature: it is usually assumed that only a few models, pre-trained on past users' experiences, are applied for prediction to all users. We observe that those pre-trained models can perform poorly for some users because they might have drastically different behaviors from the majority, and the pre-trained models cannot capture the features in unseen videos. In this work, we propose a novel meta learning based viewport prediction paradigm to alleviate the worst prediction performance and ensure the robustness of viewport prediction. This paradigm uses two machine learning models, where the first model predicts the viewing direction, and the second model predicts the minimum video prefetch size that can include the actual viewport. We first train two meta models so that they are sensitive to new training data, and then quickly adapt them to users while they are watching the videos. Evaluation results reveal that the meta models can adapt quickly to each user, and can significantly increase the prediction accuracy, especially for the worst-performing predictions.
translated by 谷歌翻译
Background samples provide key contextual information for segmenting regions of interest (ROIs). However, they always cover a diverse set of structures, causing difficulties for the segmentation model to learn good decision boundaries with high sensitivity and precision. The issue concerns the highly heterogeneous nature of the background class, resulting in multi-modal distributions. Empirically, we find that neural networks trained with heterogeneous background struggle to map the corresponding contextual samples to compact clusters in feature space. As a result, the distribution over background logit activations may shift across the decision boundary, leading to systematic over-segmentation across different datasets and tasks. In this study, we propose context label learning (CoLab) to improve the context representations by decomposing the background class into several subclasses. Specifically, we train an auxiliary network as a task generator, along with the primary segmentation model, to automatically generate context labels that positively affect the ROI segmentation accuracy. Extensive experiments are conducted on several challenging segmentation tasks and datasets. The results demonstrate that CoLab can guide the segmentation model to map the logits of background samples away from the decision boundary, resulting in significantly improved segmentation accuracy. Code is available.
translated by 谷歌翻译
Reinforcement learning (RL) gained considerable attention by creating decision-making agents that maximize rewards received from fully observable environments. However, many real-world problems are partially or noisily observable by nature, where agents do not receive the true and complete state of the environment. Such problems are formulated as partially observable Markov decision processes (POMDPs). Some studies applied RL to POMDPs by recalling previous decisions and observations or inferring the true state of the environment from received observations. Nevertheless, aggregating observations and decisions over time is impractical for environments with high-dimensional continuous state and action spaces. Moreover, so-called inference-based RL approaches require large number of samples to perform well since agents eschew uncertainty in the inferred state for the decision-making. Active inference is a framework that is naturally formulated in POMDPs and directs agents to select decisions by minimising expected free energy (EFE). This supplies reward-maximising (exploitative) behaviour in RL, with an information-seeking (exploratory) behaviour. Despite this exploratory behaviour of active inference, its usage is limited to discrete state and action spaces due to the computational difficulty of the EFE. We propose a unified principle for joint information-seeking and reward maximization that clarifies a theoretical connection between active inference and RL, unifies active inference and RL, and overcomes their aforementioned limitations. Our findings are supported by strong theoretical analysis. The proposed framework's superior exploration property is also validated by experimental results on partial observable tasks with high-dimensional continuous state and action spaces. Moreover, the results show that our model solves reward-free problems, making task reward design optional.
translated by 谷歌翻译
Structural Health Monitoring (SHM) describes a process for inferring quantifiable metrics of structural condition, which can serve as input to support decisions on the operation and maintenance of infrastructure assets. Given the long lifespan of critical structures, this problem can be cast as a sequential decision making problem over prescribed horizons. Partially Observable Markov Decision Processes (POMDPs) offer a formal framework to solve the underlying optimal planning task. However, two issues can undermine the POMDP solutions. Firstly, the need for a model that can adequately describe the evolution of the structural condition under deterioration or corrective actions and, secondly, the non-trivial task of recovery of the observation process parameters from available monitoring data. Despite these potential challenges, the adopted POMDP models do not typically account for uncertainty on model parameters, leading to solutions which can be unrealistically confident. In this work, we address both key issues. We present a framework to estimate POMDP transition and observation model parameters directly from available data, via Markov Chain Monte Carlo (MCMC) sampling of a Hidden Markov Model (HMM) conditioned on actions. The MCMC inference estimates distributions of the involved model parameters. We then form and solve the POMDP problem by exploiting the inferred distributions, to derive solutions that are robust to model uncertainty. We successfully apply our approach on maintenance planning for railway track assets on the basis of a "fractal value" indicator, which is computed from actual railway monitoring data.
translated by 谷歌翻译
As ride-hailing services become increasingly popular, being able to accurately predict demand for such services can help operators efficiently allocate drivers to customers, and reduce idle time, improve congestion, and enhance the passenger experience. This paper proposes UberNet, a deep learning Convolutional Neural Network for short-term prediction of demand for ride-hailing services. UberNet empploys a multivariate framework that utilises a number of temporal and spatial features that have been found in the literature to explain demand for ride-hailing services. The proposed model includes two sub-networks that aim to encode the source series of various features and decode the predicting series, respectively. To assess the performance and effectiveness of UberNet, we use 9 months of Uber pickup data in 2014 and 28 spatial and temporal features from New York City. By comparing the performance of UberNet with several other approaches, we show that the prediction quality of the model is highly competitive. Further, Ubernet's prediction performance is better when using economic, social and built environment features. This suggests that Ubernet is more naturally suited to including complex motivators in making real-time passenger demand predictions for ride-hailing services.
translated by 谷歌翻译